Overview

Dataset statistics

Number of variables: 32
Number of observations: 150
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 126.7 KiB
Average record size in memory: 864.8 B

Variable types

NUM: 16
CAT: 15
BOOL: 1

Reproduction

Analysis started: 2022-05-05 18:59:49.857236
Analysis finished: 2022-05-05 19:00:37.175140
Duration: 47.32 seconds
Version: pandas-profiling v2.7.1
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

Gender_father has constant value "1" (Constant)
Gender_mother has constant value "2" (Constant)
FVC_middle_child has a high cardinality: 67 distinct values (High cardinality)
FEV1_middle_child has a high cardinality: 67 distinct values (High cardinality)
AREA_CODE is highly correlated with ID (High correlation)
ID is highly correlated with AREA_CODE (High correlation)
FEV1_father is highly correlated with FVC_father (High correlation)
FVC_father is highly correlated with FEV1_father (High correlation)
Age_mother is highly correlated with Age_father (High correlation)
Age_father is highly correlated with Age_mother (High correlation)
Height_oldest_child is highly correlated with Age_oldest_child and 3 other fields (High correlation)
Age_oldest_child is highly correlated with Height_oldest_child (High correlation)
Weight_oldest_child is highly correlated with Height_oldest_child (High correlation)
FVC_oldest_child is highly correlated with Height_oldest_child and 1 other fields (High correlation)
FEV1_oldest_child is highly correlated with Height_oldest_child and 1 other fields (High correlation)
Weight_youngest_child is highly correlated with Age_youngest_child and 3 other fields (High correlation)
Age_youngest_child is highly correlated with Weight_youngest_child and 1 other fields (High correlation)
Height_youngest_child is highly correlated with Weight_youngest_child and 2 other fields (High correlation)
FVC_youngest_child is highly correlated with Sex_youngest_child and 3 other fields (High correlation)
Sex_youngest_child is highly correlated with FVC_youngest_child and 1 other fields (High correlation)
FEV1_youngest_child is highly correlated with Sex_youngest_child and 4 other fields (High correlation)
ID is uniformly distributed (Uniform)
Sex_oldest_child is uniformly distributed (Uniform)
ID has unique values (Unique)
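
The Constant, Unique and High cardinality flags above can be reproduced from distinct-value counts alone. The following is a minimal sketch, not the profiler's own code: the function name `simple_alerts` is hypothetical, and the cardinality threshold of 50 is an assumption that happens to be consistent with the report (67 distinct values are flagged, 49 are not).

```python
from collections import Counter


def simple_alerts(values, cardinality_threshold=50):
    """Return illustrative alert labels for one column.

    cardinality_threshold is an assumed cut-off, not a value read from the report.
    """
    n_distinct = len(Counter(values))
    alerts = []
    if n_distinct == 1:
        alerts.append("Constant")
    if len(values) > 1 and n_distinct == len(values):
        alerts.append("Unique")
    # High cardinality is only meaningful for text-like (categorical) columns.
    is_text = all(isinstance(v, str) for v in values)
    if is_text and n_distinct > cardinality_threshold:
        alerts.append("High cardinality")
    return alerts


gender_father = [1] * 150                              # one distinct value
ids = list(range(1, 151))                              # 150 distinct values
fvc_like = [str(i) for i in range(66)] + ["."] * 84    # 67 distinct strings

print(simple_alerts(gender_father))
print(simple_alerts(ids))
print(simple_alerts(fvc_like))
```

The correlation and uniformity flags need pairwise statistics and a distribution test respectively, so they are left out of this sketch.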

Variables

ID
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE
Distinct count150
Unique (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean75.5
Minimum1
Maximum150
Zeros0
Zeros (%)0.0%
Memory size1.3 KiB

Quantile statistics

Minimum1
5-th percentile8.45
Q138.25
median75.5
Q3112.75
95-th percentile142.55
Maximum150
Range149
Interquartile range (IQR)74.5

Descriptive statistics

Standard deviation43.44536799
Coefficient of variation (CV)0.5754353376
Kurtosis-1.2
Mean75.5
Median Absolute Deviation (MAD)37.5
Skewness0
Sum11325
Variance1887.5
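
Because ID is unique, ranges from 1 to 150 and has 150 observations, it must be the integers 1 through 150, so the figures above can be checked by hand. The sketch below uses only the standard library; the helper `quantile` is a hypothetical name, and linear interpolation between order statistics is an assumption that happens to reproduce the reported percentiles.

```python
import statistics

ids = list(range(1, 151))  # the only 150 distinct values between 1 and 150

mean = statistics.mean(ids)            # 75.5
variance = statistics.variance(ids)    # sample variance (n - 1 denominator): 1887.5
std = statistics.stdev(ids)            # square root of the sample variance
cv = std / mean                        # coefficient of variation
median = statistics.median(ids)
mad = statistics.median(abs(v - median) for v in ids)  # median absolute deviation


def quantile(sorted_values, q):
    """Quantile by linear interpolation between neighbouring order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


print(mean, variance, round(std, 4), round(cv, 4), mad)
print([round(quantile(ids, q), 2) for q in (0.05, 0.25, 0.75, 0.95)])
```

The interquartile range follows as Q3 minus Q1 (112.75 - 38.25 = 74.5) and the sum as 150 × 75.5 = 11325.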
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
1 1 0.7%
 
95 1 0.7%
 
97 1 0.7%
 
98 1 0.7%
 
99 1 0.7%
 
100 1 0.7%
 
101 1 0.7%
 
102 1 0.7%
 
103 1 0.7%
 
104 1 0.7%
 
Other values (140) 140 93.3%
 
ValueCountFrequency (%) 
1 1 0.7%
 
2 1 0.7%
 
3 1 0.7%
 
4 1 0.7%
 
5 1 0.7%
 
ValueCountFrequency (%) 
150 1 0.7%
 
149 1 0.7%
 
148 1 0.7%
 
147 1 0.7%
 
146 1 0.7%
 

AREA_CODE
Categorical

HIGH CORRELATION
Distinct count4
Unique (%)2.7%
Missing0
Missing (%)0.0%
Memory size1.3 KiB
ValueCountFrequency (%) 
4 58 38.7%
 
2 49 32.7%
 
1 24 16.0%
 
3 19 12.7%
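
A value/count/frequency table like the one above can be rebuilt with a plain counter. This is a small sketch using the counts shown for AREA_CODE; the function name `frequency_table` is hypothetical.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count, percent-string) rows, most frequent first."""
    total = len(values)
    return [
        (value, count, f"{100 * count / total:.1f}%")
        for value, count in Counter(values).most_common()
    ]


# Counts taken from the AREA_CODE table above.
area_code = ["4"] * 58 + ["2"] * 49 + ["1"] * 24 + ["3"] * 19

for row in frequency_table(area_code):
    print(*row)
```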
 

Length

Max length1
Mean length1
Min length1
ValueCountFrequency (%) 
Decimal_Number 4 100.0%
 
ValueCountFrequency (%) 
Common 4 100.0%
 
ValueCountFrequency (%) 
ASCII 4 100.0%
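
The labels Decimal_Number and Other_Punctuation are the long names of the Unicode general categories `Nd` and `Po`. The sketch below assumes the counts refer to distinct characters per category, which is consistent with the tables in this report (four distinct digits for AREA_CODE); the function name and the two-entry name mapping are illustrative.

```python
import unicodedata
from collections import Counter

# Long names for the two categories that occur in this report.
CATEGORY_NAMES = {"Nd": "Decimal_Number", "Po": "Other_Punctuation"}


def character_categories(values):
    """Count the distinct characters in a column by Unicode general category."""
    distinct_chars = set("".join(values))
    counts = Counter(unicodedata.category(ch) for ch in distinct_chars)
    return {CATEGORY_NAMES.get(cat, cat): n for cat, n in counts.items()}


print(character_categories(["4", "2", "1", "3"]))   # AREA_CODE values
print(character_categories([".", "1", "2"]))        # a column with a "." placeholder
```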
 

Gender_father
Boolean

CONSTANT
REJECTED
Distinct count1
Unique (%)0.7%
Missing0
Missing (%)0.0%
Memory size1.3 KiB
ValueCountFrequency (%) 
1 150 100.0%
 

Age_father
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count29
Unique (%)19.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean40.13333333333333
Minimum26
Maximum59
Zeros0
Zeros (%)0.0%
Memory size1.3 KiB

Quantile statistics

Minimum26
5-th percentile30
Q135
median40
Q344.75
95-th percentile52.55
Maximum59
Range33
Interquartile range (IQR)9.75

Descriptive statistics

Standard deviation6.889995341
Coefficient of variation (CV)0.1716776248
Kurtosis-0.4385605078
Mean40.13333333
Median Absolute Deviation (MAD)5
Skewness0.29431599
Sum6020
Variance47.47203579
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
40 12 8.0%
 
33 9 6.0%
 
44 9 6.0%
 
36 8 5.3%
 
37 8 5.3%
 
39 8 5.3%
 
42 8 5.3%
 
43 7 4.7%
 
46 7 4.7%
 
34 7 4.7%
 
Other values (19) 67 44.7%
 
ValueCountFrequency (%) 
26 2 1.3%
 
28 3 2.0%
 
29 1 0.7%
 
30 5 3.3%
 
31 3 2.0%
 
ValueCountFrequency (%) 
59 1 0.7%
 
54 4 2.7%
 
53 3 2.0%
 
52 4 2.7%
 
51 3 2.0%
 

Height_father_inch
Real number (ℝ≥0)

Distinct count15
Unique (%)10.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean69.26
Minimum61
Maximum76
Zeros0
Zeros (%)0.0%
Memory size1.3 KiB

Quantile statistics

Minimum61
5-th percentile65
Q167.25
median69
Q371
95-th percentile74
Maximum76
Range15
Interquartile range (IQR)3.75

Descriptive statistics

Standard deviation2.779189201
Coefficient of variation (CV)0.04012690155
Kurtosis0.1487147442
Mean69.26
Median Absolute Deviation (MAD)2
Skewness-0.1873708821
Sum10389
Variance7.723892617
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
69 23 15.3%
 
68 20 13.3%
 
70 20 13.3%
 
71 18 12.0%
 
66 14 9.3%
 
67 14 9.3%
 
72 11 7.3%
 
73 10 6.7%
 
74 7 4.7%
 
64 4 2.7%
 
Other values (5) 9 6.0%
 
ValueCountFrequency (%) 
61 2 1.3%
 
63 1 0.7%
 
64 4 2.7%
 
65 3 2.0%
 
66 14 9.3%
 
ValueCountFrequency (%) 
76 1 0.7%
 
75 2 1.3%
 
74 7 4.7%
 
73 10 6.7%
 
72 11 7.3%
 

Weight_father_lb
Real number (ℝ≥0)

Distinct count75
Unique (%)50.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean182.08666666666667
Minimum121
Maximum245
Zeros0
Zeros (%)0.0%
Memory size1.3 KiB

Quantile statistics

Minimum121
5-th percentile145
Q1166
median180
Q3198
95-th percentile222.55
Maximum245
Range124
Interquartile range (IQR)32

Descriptive statistics

Standard deviation23.95407706
Coefficient of variation (CV)0.1315531636
Kurtosis-0.2352150603
Mean182.0866667
Median Absolute Deviation (MAD)16
Skewness0.07883330816
Sum27313
Variance573.7978076
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
198 6 4.0%
 
178 5 3.3%
 
180 4 2.7%
 
194 4 2.7%
 
191 4 2.7%
 
187 4 2.7%
 
176 4 2.7%
 
209 4 2.7%
 
172 4 2.7%
 
169 4 2.7%
 
Other values (65) 107 71.3%
 
ValueCountFrequency (%) 
121 1 0.7%
 
129 1 0.7%
 
131 1 0.7%
 
132 1 0.7%
 
138 1 0.7%
 
ValueCountFrequency (%) 
245 1 0.7%
 
235 1 0.7%
 
234 1 0.7%
 
232 1 0.7%
 
228 1 0.7%
 

FVC_father
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count113
Unique (%)75.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean495.23333333333335
Minimum302
Maximum666
Zeros0
Zeros (%)0.0%
Memory size1.3 KiB

Quantile statistics

Minimum302
5-th percentile368.6
Q1441
median495.5
Q3549.75
95-th percentile631.7
Maximum666
Range364
Interquartile range (IQR)108.75

Descriptive statistics

Standard deviation79.36699369
Coefficient of variation (CV)0.1602618167
Kurtosis-0.5196151593
Mean495.2333333
Median Absolute Deviation (MAD)54.5
Skewness0.05547744947
Sum74285
Variance6299.119687
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
512 3 2.0%
 
391 3 2.0%
 
462 3 2.0%
 
452 3 2.0%
 
441 3 2.0%
 
509 3 2.0%
 
473 3 2.0%
 
409 2 1.3%
 
407 2 1.3%
 
513 2 1.3%
 
Other values (103) 123 82.0%
 
ValueCountFrequency (%) 
302 1 0.7%
 
319 1 0.7%
 
340 1 0.7%
 
345 1 0.7%
 
351 1 0.7%
 
ValueCountFrequency (%) 
666 1 0.7%
 
657 1 0.7%
 
652 1 0.7%
 
651 1 0.7%
 
650 1 0.7%
 

FEV1_father
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count115
Unique (%)76.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean409.32666666666665
Minimum250
Maximum585
Zeros0
Zeros (%)0.0%
Memory size1.3 KiB

Quantile statistics

Minimum250
5-th percentile305.45
Q1366.75
median409
Q3451
95-th percentile501
Maximum585
Range335
Interquartile range (IQR)84.25

Descriptive statistics

Standard deviation65.07522716
Coefficient of variation (CV)0.1589811573
Kurtosis-0.1772712914
Mean409.3266667
Median Absolute Deviation (MAD)42
Skewness0.0155962472
Sum61399
Variance4234.78519
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
329 3 2.0%
 
451 3 2.0%
 
399 3 2.0%
 
385 3 2.0%
 
409 3 2.0%
 
450 3 2.0%
 
398 3 2.0%
 
372 2 1.3%
 
374 2 1.3%
 
445 2 1.3%
 
Other values (105) 123 82.0%
 
ValueCountFrequency (%) 
250 1 0.7%
 
251 1 0.7%
 
280 1 0.7%
 
285 1 0.7%
 
290 1 0.7%
 
ValueCountFrequency (%) 
585 1 0.7%
 
560 1 0.7%
 
558 1 0.7%
 
545 1 0.7%
 
537 1 0.7%
 

Gender_mother
Categorical

CONSTANT
REJECTED
Distinct count1
Unique (%)0.7%
Missing0
Missing (%)0.0%
Memory size1.3 KiB
ValueCountFrequency (%) 
2 150 100.0%
 

Length

Max length1
Mean length1
Min length1
ValueCountFrequency (%) 
Decimal_Number 1 100.0%
 
ValueCountFrequency (%) 
Common 1 100.0%
 
ValueCountFrequency (%) 
ASCII 1 100.0%
 

Age_mother
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count28
Unique (%)18.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean37.56
Minimum26
Maximum56
Zeros0
Zeros (%)0.0%
Memory size1.3 KiB

Quantile statistics

Minimum26
5-th percentile28.45
Q132
median36.5
Q342
95-th percentile49.55
Maximum56
Range30
Interquartile range (IQR)10

Descriptive statistics

Standard deviation6.714184124
Coefficient of variation (CV)0.1787588958
Kurtosis-0.2379106482
Mean37.56
Median Absolute Deviation (MAD)4.5
Skewness0.5514718035
Sum5634
Variance45.08026846
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
36 12 8.0%
 
32 11 7.3%
 
40 9 6.0%
 
30 9 6.0%
 
35 9 6.0%
 
42 8 5.3%
 
38 8 5.3%
 
34 7 4.7%
 
37 7 4.7%
 
31 7 4.7%
 
Other values (18) 63 42.0%
 
ValueCountFrequency (%) 
26 2 1.3%
 
27 4 2.7%
 
28 2 1.3%
 
29 6 4.0%
 
30 9 6.0%
 
ValueCountFrequency (%) 
56 2 1.3%
 
53 1 0.7%
 
52 3 2.0%
 
50 2 1.3%
 
49 3 2.0%
 

Height_mother_inch
Real number (ℝ≥0)

Distinct count13
Unique (%)8.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean64.09333333333333
Minimum57
Maximum69
Zeros0
Zeros (%)0.0%
Memory size1.3 KiB

Quantile statistics

Minimum57
5-th percentile60.45
Q162
median64
Q366
95-th percentile68
Maximum69
Range12
Interquartile range (IQR)4

Descriptive statistics

Standard deviation2.469536996
Coefficient of variation (CV)0.03853032551
Kurtosis-0.1182924775
Mean64.09333333
Median Absolute Deviation (MAD)2
Skewness-0.2414292837
Sum9614
Variance6.098612975
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
62 26 17.3%
 
65 24 16.0%
 
64 20 13.3%
 
66 18 12.0%
 
63 16 10.7%
 
67 15 10.0%
 
61 11 7.3%
 
68 8 5.3%
 
69 4 2.7%
 
60 3 2.0%
 
Other values (3) 5 3.3%
 
ValueCountFrequency (%) 
57 2 1.3%
 
58 1 0.7%
 
59 2 1.3%
 
60 3 2.0%
 
61 11 7.3%
 
ValueCountFrequency (%) 
69 4 2.7%
 
68 8 5.3%
 
67 15 10.0%
 
66 18 12.0%
 
65 24 16.0%
 

Weight_mother_lb
Real number (ℝ≥0)

Distinct count72
Unique (%)48.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean146.97333333333333
Minimum90
Maximum267
Zeros0
Zeros (%)0.0%
Memory size1.3 KiB

Quantile statistics

Minimum90
5-th percentile110
Q1128
median140.5
Q3159
95-th percentile208.75
Maximum267
Range177
Interquartile range (IQR)31

Descriptive statistics

Standard deviation30.95568312
Coefficient of variation (CV)0.2106210863
Kurtosis2.615260843
Mean146.9733333
Median Absolute Deviation (MAD)15
Skewness1.420392372
Sum22046
Variance958.2543177
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
145 6 4.0%
 
128 6 4.0%
 
124 6 4.0%
 
132 5 3.3%
 
140 5 3.3%
 
150 5 3.3%
 
138 5 3.3%
 
143 4 2.7%
 
108 4 2.7%
 
125 4 2.7%
 
Other values (62) 100 66.7%
 
ValueCountFrequency (%) 
90 1 0.7%
 
93 1 0.7%
 
98 1 0.7%
 
108 4 2.7%
 
110 3 2.0%
 
ValueCountFrequency (%) 
267 1 0.7%
 
260 1 0.7%
 
241 1 0.7%
 
234 1 0.7%
 
231 1 0.7%
 

FVC_mother
Real number (ℝ≥0)

Distinct count110
Unique (%)73.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean350.23333333333335
Minimum206
Maximum567
Zeros0
Zeros (%)0.0%
Memory size1.3 KiB

Quantile statistics

Minimum206
5-th percentile248.15
Q1306
median349.5
Q3396.25
95-th percentile440.2
Maximum567
Range361
Interquartile range (IQR)90.25

Descriptive statistics

Standard deviation60.42727246
Coefficient of variation (CV)0.172534327
Kurtosis0.5595078234
Mean350.2333333
Median Absolute Deviation (MAD)44.5
Skewness0.2800754733
Sum52535
Variance3651.455257
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
362 4 2.7%
 
357 4 2.7%
 
349 4 2.7%
 
301 3 2.0%
 
309 3 2.0%
 
333 3 2.0%
 
418 2 1.3%
 
407 2 1.3%
 
316 2 1.3%
 
319 2 1.3%
 
Other values (100) 121 80.7%
 
ValueCountFrequency (%) 
206 1 0.7%
 
215 1 0.7%
 
232 1 0.7%
 
233 1 0.7%
 
242 1 0.7%
 
ValueCountFrequency (%) 
567 1 0.7%
 
508 1 0.7%
 
496 1 0.7%
 
492 1 0.7%
 
448 1 0.7%
 

FEV1_mother
Real number (ℝ≥0)

Distinct count104
Unique (%)69.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean297.31333333333333
Minimum175
Maximum460
Zeros0
Zeros (%)0.0%
Memory size1.3 KiB

Quantile statistics

Minimum175
5-th percentile216.35
Q1263.25
median299
Q3328
95-th percentile375
Maximum460
Range285
Interquartile range (IQR)64.75

Descriptive statistics

Standard deviation48.74135775
Coefficient of variation (CV)0.1639393605
Kurtosis0.5309313002
Mean297.3133333
Median Absolute Deviation (MAD)32
Skewness0.125627363
Sum44597
Variance2375.719955
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
256 4 2.7%
 
331 4 2.7%
 
305 4 2.7%
 
325 4 2.7%
 
307 4 2.7%
 
313 3 2.0%
 
294 3 2.0%
 
295 3 2.0%
 
296 3 2.0%
 
289 3 2.0%
 
Other values (94) 115 76.7%
 
ValueCountFrequency (%) 
175 1 0.7%
 
186 1 0.7%
 
194 1 0.7%
 
202 1 0.7%
 
204 1 0.7%
 
ValueCountFrequency (%) 
460 1 0.7%
 
426 1 0.7%
 
425 1 0.7%
 
389 1 0.7%
 
380 2 1.3%
 

Sex_oldest_child
Categorical

UNIFORM
Distinct count2
Unique (%)1.3%
Missing0
Missing (%)0.0%
Memory size1.3 KiB
ValueCountFrequency (%) 
1 75 50.0%
 
2 75 50.0%
 

Length

Max length1
Mean length1
Min length1
ValueCountFrequency (%) 
Decimal_Number 2 100.0%
 
ValueCountFrequency (%) 
Common 2 100.0%
 
ValueCountFrequency (%) 
ASCII 2 100.0%
 

Age_oldest_child
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count11
Unique (%)7.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean12.633333333333333
Minimum7
Maximum17
Zeros0
Zeros (%)0.0%
Memory size1.3 KiB

Quantile statistics

Minimum7
5-th percentile8
Q110
median13
Q316
95-th percentile17
Maximum17
Range10
Interquartile range (IQR)6

Descriptive statistics

Standard deviation3.19062525
Coefficient of variation (CV)0.2525560884
Kurtosis-1.307855709
Mean12.63333333
Median Absolute Deviation (MAD)3
Skewness-0.1380256726
Sum1895
Variance10.18008949
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
16 21 14.0%
 
17 21 14.0%
 
8 16 10.7%
 
11 15 10.0%
 
14 14 9.3%
 
9 13 8.7%
 
13 13 8.7%
 
10 11 7.3%
 
12 11 7.3%
 
15 10 6.7%
 
ValueCountFrequency (%) 
7 5 3.3%
 
8 16 10.7%
 
9 13 8.7%
 
10 11 7.3%
 
11 15 10.0%
 
ValueCountFrequency (%) 
17 21 14.0%
 
16 21 14.0%
 
15 10 6.7%
 
14 14 9.3%
 
13 13 8.7%
 

Height_oldest_child
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count27
Unique (%)18.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean60.093333333333334
Minimum46
Maximum75
Zeros0
Zeros (%)0.0%
Memory size1.3 KiB

Quantile statistics

Minimum46
5-th percentile49
Q154
median60.5
Q366
95-th percentile70
Maximum75
Range29
Interquartile range (IQR)12

Descriptive statistics

Standard deviation6.927570312
Coefficient of variation (CV)0.1152801805
Kurtosis-1.006874895
Mean60.09333333
Median Absolute Deviation (MAD)5.5
Skewness-0.1291893659
Sum9014
Variance47.99123043
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
66 12 8.0%
 
63 10 6.7%
 
59 10 6.7%
 
61 9 6.0%
 
60 8 5.3%
 
54 8 5.3%
 
56 7 4.7%
 
67 7 4.7%
 
69 7 4.7%
 
55 6 4.0%
 
Other values (17) 66 44.0%
 
ValueCountFrequency (%) 
46 2 1.3%
 
48 3 2.0%
 
49 6 4.0%
 
50 6 4.0%
 
51 5 3.3%
 
ValueCountFrequency (%) 
75 1 0.7%
 
72 3 2.0%
 
71 3 2.0%
 
70 5 3.3%
 
69 7 4.7%
 

Weight_oldest_child
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count89
Unique (%)59.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean102.62
Minimum45
Maximum207
Zeros0
Zeros (%)0.0%
Memory size1.3 KiB

Quantile statistics

Minimum45
5-th percentile52.45
Q169
median101.5
Q3125
95-th percentile171.55
Maximum207
Range162
Interquartile range (IQR)56

Descriptive statistics

Standard deviation38.08709024
Coefficient of variation (CV)0.3711468548
Kurtosis-0.4244497572
Mean102.62
Median Absolute Deviation (MAD)29.5
Skewness0.5283933259
Sum15393
Variance1450.626443
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
104 6 4.0%
 
117 6 4.0%
 
115 5 3.3%
 
66 4 2.7%
 
64 4 2.7%
 
122 4 2.7%
 
55 4 2.7%
 
111 3 2.0%
 
83 3 2.0%
 
61 3 2.0%
 
Other values (79) 108 72.0%
 
ValueCountFrequency (%) 
45 1 0.7%
 
47 1 0.7%
 
49 1 0.7%
 
50 3 2.0%
 
52 2 1.3%
 
ValueCountFrequency (%) 
207 1 0.7%
 
206 1 0.7%
 
186 1 0.7%
 
185 1 0.7%
 
180 1 0.7%
 

FVC_oldest_child
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count128
Unique (%)85.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean304.25333333333333
Minimum107
Maximum689
Zeros0
Zeros (%)0.0%
Memory size1.3 KiB

Quantile statistics

Minimum107
5-th percentile148
Q1201
median291
Q3376
95-th percentile524.3
Maximum689
Range582
Interquartile range (IQR)175

Descriptive statistics

Standard deviation126.3934513
Coefficient of variation (CV)0.415421747
Kurtosis0.0001219447841
Mean304.2533333
Median Absolute Deviation (MAD)89.5
Skewness0.6840419162
Sum45638
Variance15975.30452
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
321 3 2.0%
 
296 2 1.3%
 
353 2 1.3%
 
240 2 1.3%
 
361 2 1.3%
 
376 2 1.3%
 
128 2 1.3%
 
148 2 1.3%
 
150 2 1.3%
 
392 2 1.3%
 
Other values (118) 129 86.0%
 
ValueCountFrequency (%) 
107 1 0.7%
 
114 1 0.7%
 
118 1 0.7%
 
119 1 0.7%
 
128 2 1.3%
 
ValueCountFrequency (%) 
689 1 0.7%
 
680 1 0.7%
 
604 1 0.7%
 
581 1 0.7%
 
567 1 0.7%
 

FEV1_oldest_child
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count131
Unique (%)87.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean261.98
Minimum98
Maximum545
Zeros0
Zeros (%)0.0%
Memory size1.3 KiB

Quantile statistics

Minimum98
5-th percentile123.15
Q1175.5
median248.5
Q3323
95-th percentile453
Maximum545
Range447
Interquartile range (IQR)147.5

Descriptive statistics

Standard deviation106.1139991
Coefficient of variation (CV)0.4050461833
Kurtosis-0.3612562708
Mean261.98
Median Absolute Deviation (MAD)74.5
Skewness0.6009503645
Sum39297
Variance11260.18081
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
155 4 2.7%
 
142 3 2.0%
 
136 2 1.3%
 
233 2 1.3%
 
453 2 1.3%
 
203 2 1.3%
 
311 2 1.3%
 
220 2 1.3%
 
279 2 1.3%
 
277 2 1.3%
 
Other values (121) 127 84.7%
 
ValueCountFrequency (%) 
98 1 0.7%
 
108 1 0.7%
 
110 1 0.7%
 
111 1 0.7%
 
114 1 0.7%
 
ValueCountFrequency (%) 
545 1 0.7%
 
544 1 0.7%
 
521 1 0.7%
 
499 1 0.7%
 
474 1 0.7%
 

Sex_middle_child
Categorical

Distinct count3
Unique (%)2.0%
Missing0
Missing (%)0.0%
Memory size1.3 KiB
ValueCountFrequency (%) 
. 76 50.7%
 
2 43 28.7%
 
1 31 20.7%
 

Length

Max length1
Mean length1
Min length1
ValueCountFrequency (%) 
Decimal_Number 2 66.7%
 
Other_Punctuation 1 33.3%
 
ValueCountFrequency (%) 
Common 3 100.0%
 
ValueCountFrequency (%) 
ASCII 3 100.0%
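
The value "." makes up 50.7% of this column while the report shows 0 missing cells, which suggests that "." is a placeholder for families without a middle child. That reading is an interpretation and is not stated by the report. Under that assumption, a minimal sketch of converting the placeholder to a real missing value before profiling (the function name `parse_optional_int` is hypothetical):

```python
def parse_optional_int(text, missing_marker="."):
    """Convert a raw cell to int, treating the placeholder as missing (None)."""
    text = text.strip()
    return None if text == missing_marker else int(text)


# Counts taken from the Sex_middle_child table above.
raw_sex_middle = ["."] * 76 + ["2"] * 43 + ["1"] * 31

parsed = [parse_optional_int(cell) for cell in raw_sex_middle]
n_missing = sum(value is None for value in parsed)
missing_pct = 100 * n_missing / len(parsed)

print(n_missing, f"{missing_pct:.1f}%")
```

When loading with pandas, passing `na_values="."` to `pandas.read_csv` has the same effect, after which the middle-child and youngest-child columns would be expected to be typed as numeric rather than categorical.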
 

Age_middle_child
Categorical

Distinct count12
Unique (%)8.0%
Missing0
Missing (%)0.0%
Memory size1.3 KiB
ValueCountFrequency (%) 
. 76 50.7%
 
13 14 9.3%
 
8 12 8.0%
 
12 10 6.7%
 
10 8 5.3%
 
11 7 4.7%
 
9 6 4.0%
 
15 5 3.3%
 
7 5 3.3%
 
14 3 2.0%
 
Other values (2) 4 2.7%
 

Length

Max length2
Mean length1.34
Min length1
ValueCountFrequency (%) 
Decimal_Number 10 90.9%
 
Other_Punctuation 1 9.1%
 
ValueCountFrequency (%) 
Common 11 100.0%
 
ValueCountFrequency (%) 
ASCII 11 100.0%
 

Height_middle_child
Categorical

Distinct count23
Unique (%)15.3%
Missing0
Missing (%)0.0%
Memory size1.3 KiB
ValueCountFrequency (%) 
. 76 50.7%
 
54 9 6.0%
 
57 7 4.7%
 
50 6 4.0%
 
49 5 3.3%
 
62 5 3.3%
 
61 5 3.3%
 
64 4 2.7%
 
65 4 2.7%
 
60 4 2.7%
 
Other values (13) 25 16.7%
 

Length

Max length2
Mean length1.493333333
Min length1
ValueCountFrequency (%) 
Decimal_Number 10 90.9%
 
Other_Punctuation 1 9.1%
 
ValueCountFrequency (%) 
Common 11 100.0%
 
ValueCountFrequency (%) 
ASCII 11 100.0%
 

Weight_middle_child
Categorical

Distinct count49
Unique (%)32.7%
Missing0
Missing (%)0.0%
Memory size1.3 KiB
ValueCountFrequency (%) 
. 76 50.7%
 
85 5 3.3%
 
55 5 3.3%
 
68 3 2.0%
 
61 3 2.0%
 
72 3 2.0%
 
113 2 1.3%
 
51 2 1.3%
 
145 2 1.3%
 
58 2 1.3%
 
Other values (39) 47 31.3%
 

Length

Max length3
Mean length1.673333333
Min length1
ValueCountFrequency (%) 
Decimal_Number 10 90.9%
 
Other_Punctuation 1 9.1%
 
ValueCountFrequency (%) 
Common 11 100.0%
 
ValueCountFrequency (%) 
ASCII 11 100.0%
 

FVC_middle_child
Categorical

HIGH CARDINALITY
Distinct count67
Unique (%)44.7%
Missing0
Missing (%)0.0%
Memory size1.3 KiB
ValueCountFrequency (%) 
. 76 50.7%
 
189 2 1.3%
 
266 2 1.3%
 
257 2 1.3%
 
264 2 1.3%
 
202 2 1.3%
 
276 2 1.3%
 
296 2 1.3%
 
236 2 1.3%
 
208 1 0.7%
 
Other values (57) 57 38.0%
 

Length

Max length3
Mean length1.986666667
Min length1
ValueCountFrequency (%) 
Decimal_Number 10 90.9%
 
Other_Punctuation 1 9.1%
 
ValueCountFrequency (%) 
Common 11 100.0%
 
ValueCountFrequency (%) 
ASCII 11 100.0%
 

FEV1_middle_child
Categorical

HIGH CARDINALITY
Distinct count67
Unique (%)44.7%
Missing0
Missing (%)0.0%
Memory size1.3 KiB
ValueCountFrequency (%) 
. 76 50.7%
 
212 2 1.3%
 
165 2 1.3%
 
136 2 1.3%
 
218 2 1.3%
 
251 2 1.3%
 
293 2 1.3%
 
264 2 1.3%
 
169 2 1.3%
 
287 1 0.7%
 
Other values (57) 57 38.0%
 

Length

Max length3
Mean length1.98
Min length1
ValueCountFrequency (%) 
Decimal_Number 10 90.9%
 
Other_Punctuation 1 9.1%
 
ValueCountFrequency (%) 
Common 11 100.0%
 
ValueCountFrequency (%) 
ASCII 11 100.0%
 

Sex_youngest_child
Categorical

HIGH CORRELATION
Distinct count3
Unique (%)2.0%
Missing0
Missing (%)0.0%
Memory size1.3 KiB
ValueCountFrequency (%) 
. 126 84.0%
 
1 15 10.0%
 
2 9 6.0%
 

Length

Max length1
Mean length1
Min length1
ValueCountFrequency (%) 
Decimal_Number 2 66.7%
 
Other_Punctuation 1 33.3%
 
ValueCountFrequency (%) 
Common 3 100.0%
 
ValueCountFrequency (%) 
ASCII 3 100.0%
 

Age_youngest_child
Categorical

HIGH CORRELATION
Distinct count9
Unique (%)6.0%
Missing0
Missing (%)0.0%
Memory size1.3 KiB
ValueCountFrequency (%) 
. 126 84.0%
 
9 5 3.3%
 
10 5 3.3%
 
12 4 2.7%
 
7 3 2.0%
 
11 3 2.0%
 
14 2 1.3%
 
15 1 0.7%
 
13 1 0.7%
 

Length

Max length2
Mean length1.106666667
Min length1
ValueCountFrequency (%) 
Decimal_Number 8 88.9%
 
Other_Punctuation 1 11.1%
 
ValueCountFrequency (%) 
Common 9 100.0%
 
ValueCountFrequency (%) 
ASCII 9 100.0%
 

Height_youngest_child
Categorical

HIGH CORRELATION
Distinct count14
Unique (%)9.3%
Missing0
Missing (%)0.0%
Memory size1.3 KiB
ValueCountFrequency (%) 
. 126 84.0%
 
56 4 2.7%
 
50 3 2.0%
 
54 3 2.0%
 
58 2 1.3%
 
60 2 1.3%
 
55 2 1.3%
 
67 2 1.3%
 
66 1 0.7%
 
46 1 0.7%
 
Other values (4) 4 2.7%
 

Length

Max length2
Mean length1.16
Min length1
ValueCountFrequency (%) 
Decimal_Number 8 88.9%
 
Other_Punctuation 1 11.1%
 
ValueCountFrequency (%) 
Common 9 100.0%
 
ValueCountFrequency (%) 
ASCII 9 100.0%
 

Weight_youngest_child
Categorical

HIGH CORRELATION
Distinct count23
Unique (%)15.3%
Missing0
Missing (%)0.0%
Memory size1.3 KiB
ValueCountFrequency (%) 
. 126 84.0%
 
65 2 1.3%
 
70 2 1.3%
 
51 1 0.7%
 
68 1 0.7%
 
91 1 0.7%
 
66 1 0.7%
 
124 1 0.7%
 
76 1 0.7%
 
87 1 0.7%
 
Other values (13) 13 8.7%
 

Length

Max length3
Mean length1.2
Min length1
ValueCountFrequency (%) 
Decimal_Number 10 90.9%
 
Other_Punctuation 1 9.1%
 
ValueCountFrequency (%) 
Common 11 100.0%
 
ValueCountFrequency (%) 
ASCII 11 100.0%
 

FVC_youngest_child
Categorical

HIGH CORRELATION
Distinct count23
Unique (%)15.3%
Missing0
Missing (%)0.0%
Memory size1.3 KiB
ValueCountFrequency (%) 
. 126 84.0%
 
194 2 1.3%
 
257 2 1.3%
 
460 1 0.7%
 
197 1 0.7%
 
438 1 0.7%
 
192 1 0.7%
 
355 1 0.7%
 
220 1 0.7%
 
222 1 0.7%
 
Other values (13) 13 8.7%
 

Length

Max length3
Mean length1.32
Min length1
ValueCountFrequency (%) 
Decimal_Number 10 90.9%
 
Other_Punctuation 1 9.1%
 
ValueCountFrequency (%) 
Common 11 100.0%
 
ValueCountFrequency (%) 
ASCII 11 100.0%
 

FEV1_youngest_child
Categorical

HIGH CORRELATION
Distinct count24
Unique (%)16.0%
Missing0
Missing (%)0.0%
Memory size1.3 KiB
ValueCountFrequency (%) 
. 126 84.0%
 
191 2 1.3%
 
216 1 0.7%
 
273 1 0.7%
 
209 1 0.7%
 
372 1 0.7%
 
213 1 0.7%
 
393 1 0.7%
 
183 1 0.7%
 
142 1 0.7%
 
Other values (14) 14 9.3%
 

Length

Max length3
Mean length1.32
Min length1
ValueCountFrequency (%) 
Decimal_Number 10 90.9%
 
Other_Punctuation 1 9.1%
 
ValueCountFrequency (%) 
Common 11 100.0%
 
ValueCountFrequency (%) 
ASCII 11 100.0%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
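
The calculation just described can be sketched in a few lines of plain Python. This is an illustration of the formula, not the profiler's implementation; the function name `pearson_r` is hypothetical. The 1/n factors of the covariance and the standard deviations cancel, so sums of products are used directly.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


x = [1, 2, 3, 4, 5]
print(pearson_r(x, [3 * v + 10 for v in x]))    # perfectly linear, increasing
print(pearson_r(x, [-0.5 * v for v in x]))      # perfectly linear, decreasing
```

Changing the slope or the offset of the second series leaves the magnitude of r unchanged, which is the invariance mentioned above.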

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
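
In other words, ρ is Pearson's r applied to ranks. A minimal sketch follows, assuming tied values receive the average of the ranks they span (a common convention, not something the report specifies); the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (spread_x * spread_y)


x = [1, 2, 3, 4, 5]
print(average_ranks([10, 20, 20, 30]))
print(spearman_rho(x, [v ** 2 for v in x]))  # nonlinear but monotonic
```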

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
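
The pair-counting definition translates directly into code. The sketch below implements exactly the formula as stated (often called tau-a, with no correction for ties); the function name is hypothetical. Library implementations such as `scipy.stats.kendalltau` default to a tie-corrected variant, so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, without tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move the same way
        elif direction < 0:
            discordant += 1      # the variables move in opposite ways
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Six pairs in total: five concordant, one discordant.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```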

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient when the input follows a bivariate normal distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
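
A minimal sketch of a bias-corrected Cramér's V for a contingency table follows, using the correction as it is commonly stated for Bergsma's estimator. It is an illustration, not necessarily the exact code path of the profiler; the function name is hypothetical, and the table is assumed to have no empty rows or columns.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected value of phi2 under independence, floored at zero.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


print(cramers_v_corrected([[10, 10], [10, 10]]))  # independent categories
print(cramers_v_corrected([[50, 0], [0, 50]]))    # perfectly associated categories
```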

Missing values

Sample

First rows

,ID,AREA_CODE,Gender_father,Age_father,Height_father_inch,Weight_father_lb,FVC_father,FEV1_father,Gender_mother,Age_mother,Height_mother_inch,Weight_mother_lb,FVC_mother,FEV1_mother,Sex_oldest_child,Age_oldest_child,Height_oldest_child,Weight_oldest_child,FVC_oldest_child,FEV1_oldest_child,Sex_middle_child,Age_middle_child,Height_middle_child,Weight_middle_child,FVC_middle_child,FEV1_middle_child,Sex_youngest_child,Age_youngest_child,Height_youngest_child,Weight_youngest_child,FVC_youngest_child,FEV1_youngest_child
0,1,1,1,53,61,161,391,323,2,43,62,136,370,331,2,12,59,115,296,279,.,.,.,.,.,.,.,.,.,.,.,.
1,2,1,1,40,72,198,441,395,2,38,66,160,411,347,1,10,56,66,323,239,.,.,.,.,.,.,.,.,.,.,.,.
2,3,1,1,26,69,210,445,347,2,27,59,114,309,265,1,8,50,59,114,111,.,.,.,.,.,.,.,.,.,.,.,.
3,4,1,1,34,68,187,433,374,2,36,58,123,265,206,2,11,57,106,256,185,1,9,49,56,159,130,.,.,.,.,.,.
4,5,1,1,46,61,121,354,290,2,39,62,128,245,233,1,16,61,88,260,247,2,12,60,85,268,234,2,10,50,53,154,143
5,6,1,1,44,72,153,610,491,2,36,66,125,349,306,1,15,67,100,389,355,1,13,57,87,276,237,2,10,55,72,195,169
6,7,1,1,35,64,145,345,339,2,27,68,206,492,425,2,11,54,70,218,163,.,.,.,.,.,.,.,.,.,.,.,.
7,8,1,1,45,69,166,484,419,2,45,63,115,342,271,1,15,67,153,460,388,2,9,54,81,193,187,.,.,.,.,.,.
8,9,1,1,45,68,180,489,429,2,41,68,144,357,313,2,14,65,144,289,272,1,12,62,108,257,235,.,.,.,.,.,.
9,10,1,1,30,66,166,550,449,2,26,67,156,364,345,2,9,49,52,192,142,.,.,.,.,.,.,.,.,.,.,.,.

Last rows

,ID,AREA_CODE,Gender_father,Age_father,Height_father_inch,Weight_father_lb,FVC_father,FEV1_father,Gender_mother,Age_mother,Height_mother_inch,Weight_mother_lb,FVC_mother,FEV1_mother,Sex_oldest_child,Age_oldest_child,Height_oldest_child,Weight_oldest_child,FVC_oldest_child,FEV1_oldest_child,Sex_middle_child,Age_middle_child,Height_middle_child,Weight_middle_child,FVC_middle_child,FEV1_middle_child,Sex_youngest_child,Age_youngest_child,Height_youngest_child,Weight_youngest_child,FVC_youngest_child,FEV1_youngest_child
140,141,4,1,28,68,198,534,451,2,27,64,220,406,348,1,10,55,91,259,202,.,.,.,.,.,.,.,.,.,.,.,.
141,142,4,1,44,67,149,401,347,2,39,64,143,306,253,2,9,51,60,118,114,.,.,.,.,.,.,.,.,.,.,.,.
142,143,4,1,36,70,191,440,359,2,33,67,142,378,332,2,8,52,67,160,139,.,.,.,.,.,.,.,.,.,.,.,.
143,144,4,1,44,71,220,462,376,2,36,64,267,301,296,1,17,69,136,452,407,.,.,.,.,.,.,.,.,.,.,.,.
144,145,4,1,40,73,214,537,448,2,35,61,130,307,263,1,14,63,117,346,279,2,8,52,68,182,145,.,.,.,.,.,.
145,146,4,1,53,69,162,441,340,2,50,65,147,279,245,2,17,66,127,360,347,.,.,.,.,.,.,.,.,.,.,.,.
146,147,4,1,37,72,195,473,418,2,37,64,145,404,346,1,11,55,66,223,169,.,.,.,.,.,.,.,.,.,.,.,.
147,148,4,1,39,67,181,549,450,2,33,63,132,360,313,1,11,56,96,232,211,1,8,49,55,181,165,.,.,.,.,.,.
148,149,4,1,36,66,129,495,374,2,29,60,110,380,325,1,7,46,49,161,136,.,.,.,.,.,.,.,.,.,.,.,.
149,150,4,1,48,64,170,351,292,2,44,63,150,358,289,2,15,61,115,353,285,.,.,.,.,.,.,.,.,.,.,.,.